google-ml_collections-ad8f348/.github/ISSUE_TEMPLATE/bug_report.md
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: bug
assignees: mohitreddy1996
---
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
***ConfigDict***
Please consider providing a Colab link reproducing the issue.
***ConfigFlags***
Please consider providing a Colab link in case of `config_flags.DEFINE_config_dict`, or a simple repository with a Python script that uses `config_flags.DEFINE_config_file` and the config flag. Ideally, the repository contains a README file with any setup steps and instructions for running the script.
**Expected behavior**
A clear and concise description of what you expected to happen.
**Environment:**
- OS: [e.g. MacOS]
- OS Version: [e.g. 22]
- Python: [e.g. Python 3.6]
**Additional context**
Add any other context about the problem here.
google-ml_collections-ad8f348/.github/workflows/pytest_and_autopublish.yml
name: Unittests & Auto-publish
# Allow triggering the workflow manually (e.g. when deps change)
on: [push, workflow_dispatch]
jobs:
  pytest-job:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.10', '3.11', '3.12']
    timeout-minutes: 30
    concurrency:
      group: ${{ github.workflow }}-${{ github.ref }}-${{ matrix.python-version }}
      cancel-in-progress: true
    steps:
    - uses: actions/checkout@v4
    # Install deps
    - uses: actions/setup-python@v5
      with:
        # Use the version selected by the matrix rather than a fixed one.
        python-version: ${{ matrix.python-version }}
        # Uncomment to cache pip dependencies (if tests are too slow)
        # cache: pip
        # cache-dependency-path: '**/pyproject.toml'
    - run: pip --version
    - run: pip install -e .[dev]
    - run: pip freeze
    # Run tests (in parallel)
    - name: Run core tests
      run: |
        # TODO(marcenacp): do not ignore tests.
        pytest -vv -n auto ml_collections/ \
          --ignore=ml_collections/config_dict/examples/examples_test.py
  # Auto-publish when version is increased
  publish-job:
    # Only try to publish if:
    # * Repo is self (prevents running from forks)
    # * Branch is `master`
    if: |
      github.repository == 'google/ml_collections'
      && github.ref == 'refs/heads/master'
    needs: pytest-job  # Only publish after tests are successful
    runs-on: ubuntu-latest
    permissions:
      contents: write
    timeout-minutes: 30
    steps:
    # Publish the package (if local `__version__` > pip version)
    - uses: etils-actions/pypi-auto-publish@v1
      with:
        pypi-token: ${{ secrets.PYPI_API_TOKEN }}
        gh-token: ${{ secrets.GITHUB_TOKEN }}
google-ml_collections-ad8f348/.gitignore
docs/__autosummary
docs/_build
docs/make.bat
*.rej
*~
\#*\#
# Compiled python modules.
*.pyc
# Byte-compiled
__pycache__/
.cache/
# Poetry, setuptools, PyPI distribution artifacts.
/*.egg-info
.eggs/
build/
dist/
poetry.lock
# Tests
.pytest_cache/
# Type checking
.pytype/
# Other
*.DS_Store
# PyCharm
.idea
google-ml_collections-ad8f348/LICENSE
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
google-ml_collections-ad8f348/README.md
# ML Collections
ML Collections is a library of Python Collections designed for ML use cases.
[Documentation](https://ml-collections.readthedocs.io/en/latest/?badge=latest)
[PyPI package](https://badge.fury.io/py/ml-collections)
[Unittests & Auto-publish workflow](https://github.com/google/ml_collections/actions/workflows/pytest_and_autopublish.yml)
## ConfigDict
`ConfigDict` and `FrozenConfigDict` are "dict-like" data structures with dot
access to nested elements. Together, they are intended to be the main way of
expressing configurations of experiments and models.
This document describes example usage of `ConfigDict`, `FrozenConfigDict`, and
`FieldReference`.
### Features
* Dot-based access to fields.
* Locking mechanism to prevent spelling mistakes.
* Lazy computation.
* `FrozenConfigDict` class, which is immutable and hashable.
* Type safety.
* "Did you mean" functionality.
* Human readable printing (with valid references and cycles), using valid YAML
format.
* Fields can be passed as keyword arguments using the `**` operator.
* There is one exception to the strong type-safety of the ConfigDict: `int`
values can be passed in to fields of type `float`. In such a case, the value
is type-converted to a `float` before being stored. (Back in the day of
Python 2, there was a similar exception to allow both `str` and `unicode`
values in string fields.)
### Basic Usage
```python
from ml_collections import config_dict
cfg = config_dict.ConfigDict()
cfg.float_field = 12.6
cfg.integer_field = 123
cfg.another_integer_field = 234
cfg.nested = config_dict.ConfigDict()
cfg.nested.string_field = 'tom'
print(cfg.integer_field) # Prints 123.
print(cfg['integer_field']) # Prints 123 as well.
try:
  cfg.integer_field = 'tom'  # Raises TypeError as this field is an integer.
except TypeError as e:
  print(e)
cfg.float_field = 12 # Works: `Int` types can be assigned to `Float`.
cfg.nested.string_field = u'bob' # `String` fields can store Unicode strings.
print(cfg)
```
### FrozenConfigDict
A `FrozenConfigDict` is an immutable, hashable type of `ConfigDict`:
```python
from ml_collections import config_dict
initial_dictionary = {
'int': 1,
'list': [1, 2],
'tuple': (1, 2, 3),
'set': {1, 2, 3, 4},
'dict_tuple_list': {'tuple_list': ([1, 2], 3)}
}
cfg = config_dict.ConfigDict(initial_dictionary)
frozen_dict = config_dict.FrozenConfigDict(initial_dictionary)
print(frozen_dict.tuple) # Prints tuple (1, 2, 3)
print(frozen_dict.list) # Prints tuple (1, 2)
print(frozen_dict.set) # Prints frozenset {1, 2, 3, 4}
print(frozen_dict.dict_tuple_list.tuple_list[0]) # Prints tuple (1, 2)
frozen_cfg = config_dict.FrozenConfigDict(cfg)
print(frozen_cfg == frozen_dict) # True
print(hash(frozen_cfg) == hash(frozen_dict)) # True
try:
  frozen_dict.int = 2  # Raises AttributeError as FrozenConfigDict is immutable.
except AttributeError as e:
  print(e)
# Converting between `FrozenConfigDict` and `ConfigDict`:
thawed_frozen_cfg = config_dict.ConfigDict(frozen_dict)
print(thawed_frozen_cfg == cfg) # True
frozen_cfg_to_cfg = frozen_dict.as_configdict()
print(frozen_cfg_to_cfg == cfg) # True
```
### FieldReferences and placeholders
A `FieldReference` is useful for having multiple fields use the same value. It
can also be used for [lazy computation](#lazy-computation).
You can use `placeholder()` as a shortcut to create a `FieldReference` (field)
with a `None` default value. This is useful if a program uses optional
configuration fields.
```python
from ml_collections import config_dict
placeholder = config_dict.FieldReference(0)
cfg = config_dict.ConfigDict()
cfg.placeholder = placeholder
cfg.optional = config_dict.placeholder(int)
cfg.nested = config_dict.ConfigDict()
cfg.nested.placeholder = placeholder
try:
  cfg.optional = 'tom'  # Raises TypeError as this field is an integer.
except TypeError as e:
  print(e)
cfg.optional = 1555  # Works fine.
cfg.placeholder = 1  # Changes the value of both placeholder and
                     # nested.placeholder fields.
print(cfg)
```
Note that the indirection provided by `FieldReference`s will be lost if accessed
through a `ConfigDict`.
```python
from ml_collections import config_dict
placeholder = config_dict.FieldReference(0)
cfg = config_dict.ConfigDict()
cfg.field1 = placeholder
cfg.field2 = placeholder  # This field will be tied to cfg.field1.
cfg.field3 = cfg.field1  # This will just be an int field initialized to 0.
```
### Lazy computation
Using a `FieldReference` in a standard operation (addition, subtraction,
multiplication, etc...) will return another `FieldReference` that points to the
original's value. You can use `FieldReference.get()` to execute the operations
and get the reference's computed value, and `FieldReference.set()` to change the
original reference's value.
```python
from ml_collections import config_dict
ref = config_dict.FieldReference(1)
print(ref.get()) # Prints 1
add_ten = ref.get() + 10 # ref.get() is an integer and so is add_ten
add_ten_lazy = ref + 10 # add_ten_lazy is a FieldReference - NOT an integer
print(add_ten) # Prints 11
print(add_ten_lazy.get()) # Prints 11 because ref's value is 1
# Addition is lazily computed for FieldReferences, so changing ref will change
# the value that add_ten_lazy evaluates to, but not add_ten.
ref.set(5)
print(add_ten) # Prints 11
print(add_ten_lazy.get()) # Prints 15 because ref's value is 5
```
If a `FieldReference` has `None` as its original value, or any operation has an
argument of `None`, then the lazy computation will evaluate to `None`.
We can also use fields in a `ConfigDict` in lazy computation. In this case a
field will only be lazily evaluated if `ConfigDict.get_ref()` is used to get it.
```python
from ml_collections import config_dict
config = config_dict.ConfigDict()
config.reference_field = config_dict.FieldReference(1)
config.integer_field = 2
config.float_field = 2.5
# No lazy evaluations because we didn't use get_ref()
config.no_lazy = config.integer_field * config.float_field
# This will lazily evaluate ONLY config.integer_field
config.lazy_integer = config.get_ref('integer_field') * config.float_field
# This will lazily evaluate ONLY config.float_field
config.lazy_float = config.integer_field * config.get_ref('float_field')
# This will lazily evaluate BOTH config.integer_field and config.float_field
config.lazy_both = (config.get_ref('integer_field') *
                    config.get_ref('float_field'))
config.integer_field = 3
print(config.no_lazy) # Prints 5.0 - It uses integer_field's original value
print(config.lazy_integer) # Prints 7.5
config.float_field = 3.5
print(config.lazy_float) # Prints 7.0
print(config.lazy_both) # Prints 10.5
```
#### Changing lazily computed values
Lazily computed values in a ConfigDict can be overridden in the same way as
regular values. The reference to the `FieldReference` used for the lazy
computation will be lost and all computations downstream in the reference graph
will use the new value.
```python
from ml_collections import config_dict
config = config_dict.ConfigDict()
config.reference = 1
config.reference_0 = config.get_ref('reference') + 10
config.reference_1 = config.get_ref('reference') + 20
config.reference_1_0 = config.get_ref('reference_1') + 100
print(config.reference) # Prints 1.
print(config.reference_0) # Prints 11.
print(config.reference_1) # Prints 21.
print(config.reference_1_0) # Prints 121.
config.reference_1 = 30
print(config.reference) # Prints 1 (unchanged).
print(config.reference_0) # Prints 11 (unchanged).
print(config.reference_1) # Prints 30.
print(config.reference_1_0) # Prints 130.
```
#### Cycles
You cannot create cycles using references. Fortunately
[the only way](#changing-lazily-computed-values) to create a cycle is by
assigning a computed field to one that *is not* the result of computation. This
is forbidden:
```python
from ml_collections import config_dict
config = config_dict.ConfigDict()
config.integer_field = 1
config.bigger_integer_field = config.get_ref('integer_field') + 10
try:
  # Raises a MutabilityError because setting config.integer_field would
  # cause a cycle.
  config.integer_field = config.get_ref('bigger_integer_field') + 2
except config_dict.MutabilityError as e:
  print(e)
```
#### One-way references
One gotcha with `get_ref` is that it creates a bi-directional dependency when no operations are performed on the value.
```python
from ml_collections import config_dict
config = config_dict.ConfigDict()
config.reference = 1
config.reference_0 = config.get_ref('reference')
config.reference_0 = 2
print(config.reference) # Prints 2.
print(config.reference_0) # Prints 2.
```
This can be avoided by using `get_oneway_ref` instead of `get_ref`.
```python
from ml_collections import config_dict
config = config_dict.ConfigDict()
config.reference = 1
config.reference_0 = config.get_oneway_ref('reference')
config.reference_0 = 2
print(config.reference) # Prints 1.
print(config.reference_0) # Prints 2.
```
### Advanced usage
Here are some more advanced examples showing lazy computation with different
operators and data types.
```python
from ml_collections import config_dict
config = config_dict.ConfigDict()
config.float_field = 12.6
config.integer_field = 123
config.list_field = [0, 1, 2]
config.float_multiply_field = config.get_ref('float_field') * 3
print(config.float_multiply_field) # Prints 37.8
config.float_field = 10.0
print(config.float_multiply_field) # Prints 30.0
config.longer_list_field = config.get_ref('list_field') + [3, 4, 5]
print(config.longer_list_field) # Prints [0, 1, 2, 3, 4, 5]
config.list_field = [-1]
print(config.longer_list_field) # Prints [-1, 3, 4, 5]
# Both operands can be references
config.ref_subtraction = (
config.get_ref('float_field') - config.get_ref('integer_field'))
print(config.ref_subtraction) # Prints -113.0
config.integer_field = 10
print(config.ref_subtraction) # Prints 0.0
```
### Equality checking
You can use `==` and `.eq_as_configdict()` to check equality among `ConfigDict`
and `FrozenConfigDict` objects.
```python
from ml_collections import config_dict
dict_1 = {'list': [1, 2]}
dict_2 = {'list': (1, 2)}
cfg_1 = config_dict.ConfigDict(dict_1)
frozen_cfg_1 = config_dict.FrozenConfigDict(dict_1)
frozen_cfg_2 = config_dict.FrozenConfigDict(dict_2)
# True because FrozenConfigDict converts lists to tuples
print(frozen_cfg_1.items() == frozen_cfg_2.items())
# False because == distinguishes the underlying difference
print(frozen_cfg_1 == frozen_cfg_2)
# False because == distinguishes these types
print(frozen_cfg_1 == cfg_1)
# But eq_as_configdict() treats both as ConfigDict, so these are True:
print(frozen_cfg_1.eq_as_configdict(cfg_1))
print(cfg_1.eq_as_configdict(frozen_cfg_1))
```
### Equality checking with lazy computation
Equality checks compare the computed values. Two configs are equal even if
their computations differ, as long as those computations produce the same
values.
```python
from ml_collections import config_dict
cfg_1 = config_dict.ConfigDict()
cfg_1.a = 1
cfg_1.b = cfg_1.get_ref('a') + 2
cfg_2 = config_dict.ConfigDict()
cfg_2.a = 1
cfg_2.b = cfg_2.get_ref('a') * 3
# True because all computed values are the same
print(cfg_1 == cfg_2)
```
### Locking and copying
Here is an example with `lock()` and `deepcopy()`:
```python
import copy
from ml_collections import config_dict
cfg = config_dict.ConfigDict()
cfg.integer_field = 123
# Locking prohibits the addition and deletion of new fields but allows
# modification of existing values.
cfg.lock()
try:
  cfg.intagar_field = 124  # Modifies the wrong field
except AttributeError as e:  # Raises AttributeError and suggests valid field.
  print(e)
with cfg.unlocked():
  cfg.intagar_field = 1555  # Works fine.
# Get a copy of the config dict.
new_cfg = copy.deepcopy(cfg)
new_cfg.integer_field = -123 # Works fine.
print(cfg)
print(new_cfg)
```
Output:
```
'Key "intagar_field" does not exist and cannot be added since the config is locked. Other fields present: "{\'integer_field\': 123}"\nDid you mean "integer_field" instead of "intagar_field"?'
intagar_field: 1555
integer_field: 123
intagar_field: 1555
integer_field: -123
```
### Dictionary attributes and initialization
```python
from ml_collections import config_dict
referenced_dict = {'inner_float': 3.14}
d = {
'referenced_dict_1': referenced_dict,
'referenced_dict_2': referenced_dict,
'list_containing_dict': [{'key': 'value'}],
}
# We can initialize on a dictionary
cfg = config_dict.ConfigDict(d)
# Reference structure is preserved
print(id(cfg.referenced_dict_1) == id(cfg.referenced_dict_2)) # True
# And the dict attributes have been converted to ConfigDict
print(type(cfg.referenced_dict_1)) # ConfigDict
# However, the initialization does not look inside of lists, so dicts inside
# lists are not converted to ConfigDict
print(type(cfg.list_containing_dict[0])) # dict
```
### More Examples
For more examples, take a look at
[`ml_collections/config_dict/examples/`](https://github.com/google/ml_collections/tree/master/ml_collections/config_dict/examples).
For examples and gotchas specifically about initializing a ConfigDict, see
[`ml_collections/config_dict/examples/config_dict_initialization.py`](https://github.com/google/ml_collections/blob/master/ml_collections/config_dict/examples/config_dict_initialization.py).
## Config Flags
This library adds flag definitions to `absl.flags` to handle config files. It
does not wrap `absl.flags` so if using any standard flag definitions alongside
config file flags, users must also import `absl.flags`.
Currently, this module adds two new flag types, namely `DEFINE_config_file`
which accepts a path to a Python file that generates a configuration, and
`DEFINE_config_dict` which accepts a configuration directly. Configurations are
dict-like structures (see [ConfigDict](#configdict)) whose nested elements
can be overridden using special command-line flags. See the examples below
for more details.
### Usage
Use `ml_collections.config_flags` alongside `absl.flags`. For
example:
`script.py`:
```python
from absl import app
from absl import flags
from ml_collections import config_flags
_CONFIG = config_flags.DEFINE_config_file('my_config')
# absl expects a help string for each flag definition.
_MY_FLAG = flags.DEFINE_integer('my_flag', None, 'An example integer flag.')
def main(_):
  print(_CONFIG.value)
  print(_MY_FLAG.value)
if __name__ == '__main__':
  app.run(main)
```
`config.py`:
```python
# Note that this is a valid Python script.
# get_config() can return an arbitrary dict-like object. However, it is advised
# to use ml_collections.config_dict.ConfigDict.
# See ml_collections/config_dict/examples/config_dict_basic.py
from ml_collections import config_dict
def get_config():
  config = config_dict.ConfigDict()
  config.field1 = 1
  config.field2 = 'tom'
  config.nested = config_dict.ConfigDict()
  config.nested.field = 2.23
  config.tuple = (1, 2, 3)
  return config
```
Warning: If you are using a pickle-based distributed programming framework such
as [Launchpad](https://github.com/deepmind/launchpad#readme), be aware of
limitations on the structure of this script that are
[described below](#config_files_and_pickling).
Now, after running:
```bash
python script.py --my_config=config.py \
--my_config.field1=8 \
--my_config.nested.field=2.1 \
--my_config.tuple='(1, 2, (1, 2))'
```
we get:
```
field1: 8
field2: tom
nested:
field: 2.1
tuple: !!python/tuple
- 1
- 2
- !!python/tuple
- 1
- 2
```
Usage of `DEFINE_config_dict` is similar to `DEFINE_config_file`. The main
difference is that the configuration is defined in `script.py` instead of in a
separate file.
`script.py`:
```python
from absl import app
from ml_collections import config_dict
from ml_collections import config_flags
config = config_dict.ConfigDict()
config.field1 = 1
config.field2 = 'tom'
config.nested = config_dict.ConfigDict()
config.nested.field = 2.23
config.tuple = (1, 2, 3)
_CONFIG = config_flags.DEFINE_config_dict('my_config', config)
def main(_):
  print(_CONFIG.value)
if __name__ == '__main__':
  app.run(main)
```
`config_file` flags are compatible with the command-line flag syntax. All the
following options are supported for non-boolean values in configurations:
* `-(-)config.field=value`
* `-(-)config.field value`
Options for boolean values are slightly different:
* `-(-)config.boolean_field`: set boolean value to True.
* `-(-)noconfig.boolean_field`: set boolean value to False.
* `-(-)config.boolean_field=value`: `value` is `true`, `false`, `True` or
`False`.
Note that `-(-)config.boolean_field value` is not supported.
### Parameterising the get_config() function
It's sometimes useful to be able to pass parameters into `get_config`, and
change what is returned based on this configuration. One example is if you are
grid searching over parameters which have a different hierarchical structure -
the flag needs to be present in the resulting ConfigDict. It would be possible
to include the union of all possible leaf values in your ConfigDict,
but this produces a confusing config result as you have to remember which
parameters will actually have an effect and which won't.
A better system is to pass some configuration, indicating which structure of
ConfigDict should be returned. An example is the following config file:
```python
from ml_collections import config_dict
def get_config(config_string):
  possible_structures = {
      'linear': config_dict.ConfigDict({
          'model_constructor': 'snt.Linear',
          'model_config': config_dict.ConfigDict({
              'output_size': 42,
          }),
      }),
      'lstm': config_dict.ConfigDict({
          'model_constructor': 'snt.LSTM',
          'model_config': config_dict.ConfigDict({
              'hidden_size': 108,
          }),
      }),
  }
  return possible_structures[config_string]
```
The value of `config_string` will be anything that is to the right of the first
colon in the config file path, if one exists. If no colon exists, no value is
passed to `get_config` (producing a TypeError if `get_config` expects a value).
The above example can be run like:
```bash
python script.py -- --config=path_to_config.py:linear \
--config.model_config.output_size=256
```
or like:
```bash
python script.py -- --config=path_to_config.py:lstm \
--config.model_config.hidden_size=512
```
### Additional features
* Loads any valid Python script which defines a `get_config()` function
  returning any Python object.
* Automatic locking of the loaded object, if the loaded object defines a
  callable `.lock()` method.
* Supports command-line overriding of arbitrarily nested values in dict-like
  objects (with key/attribute based getters/setters) of the following types:
  * `int`
  * `float`
  * `bool`
  * `str`
  * `tuple` (but **not** `list`)
  * `enum.Enum`
* Overriding is type safe.
* Overriding of a `tuple` can be done by passing in the `tuple` value as a
string (see the example in the [Usage](#usage) section).
* The overriding `tuple` object can be of a different length and have
different item types than the original. Nested tuples are also supported.
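To make the tuple rules above concrete, here is a minimal sketch of turning a tuple written as a string into a Python tuple. It uses the standard library's `ast.literal_eval`; whether `config_flags` parses tuples the same way is an assumption, and `parse_tuple_override` is a hypothetical name.

```python
import ast


def parse_tuple_override(raw):
  """Parses a tuple literal given as a string, rejecting other literals."""
  value = ast.literal_eval(raw)
  if not isinstance(value, tuple):
    # Lists and scalars are rejected, matching "tuple (but not list)".
    raise ValueError(f'Expected a tuple literal, got {raw!r}')
  return value


original = (1, 2, 3)
# The new value has a different length, different item types and a nested
# tuple, all of which the list above says are allowed.
overridden = parse_tuple_override("('a', 2.5, (True, None))")
```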
### Config Files and Pickling {#config_files_and_pickling}
This is likely to be troublesome:
```python {.bad}
import dataclasses


# Defined at module scope: pickled by reference to the config module.
@dataclasses.dataclass
class MyRecord:
  num_balloons: int
  color: str


def get_config():
  return MyRecord(num_balloons=99, color='red')
```
This is not:
```python {.good}
import dataclasses


def get_config():
  # Defined inside get_config(): serialized together with its values.
  @dataclasses.dataclass
  class MyRecord:
    num_balloons: int
    color: str

  return MyRecord(num_balloons=99, color='red')
```
#### Explanation
A config file is a Python module but it is not imported through Python's usual
module-importing mechanism.
Meanwhile, serialization libraries such as [`cloudpickle`](
https://github.com/cloudpipe/cloudpickle#readme) (which is used by [Launchpad](
https://github.com/deepmind/launchpad#readme)) and [Apache Beam](
https://beam.apache.org/) expect to be able to pickle an object without also
pickling every type to which it refers, on the assumption that types defined
at module scope can later be reconstructed simply by re-importing the modules
in which they are defined.
That assumption does not hold for a type that is defined at module scope in a
config file, because the config file can't be imported the usual way. The
symptom of this will be an `ImportError` when unpickling an object.
The treatment is to move types from module scope into `get_config()` so that
they will be serialized along with the values that have those types.
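The failure mode can be reproduced with the standard library alone. The sketch below is an illustration under assumptions: it loads a throwaway config file by path with `importlib` (not necessarily the way `config_flags` does it), uses plain `pickle` in place of `cloudpickle`, and simulates a fresh worker process by dropping the module from `sys.modules`. The module name `hypothetical_demo_config` and both helper functions are made up for this example.

```python
import importlib.util
import os
import pickle
import sys
import tempfile

# Contents of a throwaway config file with a type defined at module scope.
CONFIG_SOURCE = '''
import dataclasses

@dataclasses.dataclass
class MyRecord:
  num_balloons: int
  color: str

def get_config():
  return MyRecord(num_balloons=99, color='red')
'''

MODULE_NAME = 'hypothetical_demo_config'  # Hypothetical, for illustration.


def load_config_by_path(path):
  """Executes the file at `path` as a module without using sys.path."""
  spec = importlib.util.spec_from_file_location(MODULE_NAME, path)
  module = importlib.util.module_from_spec(spec)
  sys.modules[MODULE_NAME] = module
  spec.loader.exec_module(module)
  return module


def unpickling_fails_without_module():
  """Returns True if unpickling raises ImportError once the module is gone."""
  with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo_config.py')
    with open(path, 'w') as f:
      f.write(CONFIG_SOURCE)
    config = load_config_by_path(path).get_config()
    # The payload refers to MyRecord by module and class name only.
    payload = pickle.dumps(config)
  # Simulate a worker process that never loaded the config file.
  del sys.modules[MODULE_NAME]
  try:
    pickle.loads(payload)
  except ImportError:
    return True
  return False
```

Unpickling fails because the payload only names the module, and that module cannot be found through the normal import system.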
## Authors
* Sergio Gómez Colmenarejo - sergomez@google.com
* Wojciech Marian Czarnecki - lejlot@google.com
* Nicholas Watters
* Mohit Reddy - mohitreddy@google.com
google-ml_collections-ad8f348/docs/ 0000775 0000000 0000000 00000000000 15000135257 0017343 5 ustar 00root root 0000000 0000000 google-ml_collections-ad8f348/docs/CONTRIBUTING.md 0000664 0000000 0000000 00000002117 15000135257 0021575 0 ustar 00root root 0000000 0000000 # How to Contribute
We'd love to accept your patches and contributions to this project. There are
just a few small guidelines you need to follow.
## Contributor License Agreement
Contributions to this project must be accompanied by a Contributor License
Agreement (CLA). You (or your employer) retain the copyright to your
contribution; this simply gives us permission to use and redistribute your
contributions as part of the project. Head over to
to see your current agreements on file or
to sign a new one.
You generally only need to submit a CLA once, so if you've already submitted one
(even if it was for a different project), you probably don't need to do it
again.
## Code reviews
All submissions, including submissions by project members, require review. We
use GitHub pull requests for this purpose. Consult
[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more
information on using pull requests.
## Community Guidelines
This project follows
[Google's Open Source Community Guidelines](https://opensource.google/conduct/).
google-ml_collections-ad8f348/docs/Makefile 0000664 0000000 0000000 00000001172 15000135257 0021004 0 ustar 00root root 0000000 0000000 # Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
google-ml_collections-ad8f348/docs/README.md 0000664 0000000 0000000 00000000460 15000135257 0020622 0 ustar 00root root 0000000 0000000 # ML Collections docs
The ML Collections documentation can be found here: https://ml-collections.readthedocs.io/en/latest/
# How to build the docs
1. Install the requirements in ml_collections/docs/requirements.txt.
2. Ensure `pandoc` is installed.
3. Run `make html` to locally generate documentation. google-ml_collections-ad8f348/docs/conf.py 0000664 0000000 0000000 00000005467 15000135257 0020656 0 ustar 00root root 0000000 0000000 # Copyright 2024 The ML Collections Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
import os
import sys
sys.path.insert(0, os.path.abspath('..'))
# -- Project information -----------------------------------------------------
project = 'ml_collections'
copyright = '2020, The ML Collection Authors'
author = 'The ML Collection Authors'
# The full version, including alpha/beta/rc tags
release = '0.1.0'
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.autosummary',
    'sphinx.ext.intersphinx',
    'sphinx.ext.mathjax',
    'sphinx.ext.napoleon',
    'sphinx.ext.viewcode',
    'nbsphinx',
    'recommonmark',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
autosummary_generate = True
master_doc = 'index'
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static'] google-ml_collections-ad8f348/docs/config_dict.rst 0000664 0000000 0000000 00000002053 15000135257 0022345 0 ustar 00root root 0000000 0000000 ml_collections.config_dict package
==================================
.. currentmodule:: ml_collections.config_dict
.. automodule:: ml_collections.config_dict
ConfigDict class
----------------
.. autoclass:: ConfigDict
    :members: __init__, is_type_safe, convert_dict, lock, is_locked, unlock,
        get, get_oneway_ref, items, iteritems, keys, iterkeys, values, itervalues,
        eq_as_configdict, to_yaml, to_json, to_json_best_effort, to_dict,
        copy_and_resolve_references, unlocked, ignore_type, get_type, update,
        update_from_flattened_dict
FrozenConfigDict class
----------------------
.. autoclass:: FrozenConfigDict
    :members: __init__
FieldReference class
--------------------
.. autoclass:: FieldReference
    :members: __init__, has_cycle, set, empty, get, get_type, identity, to_int,
        to_float, to_str
Additional Methods
------------------
.. autosummary::
:toctree: __autosummary
create
placeholder
required_placeholder
recursive_rename
CustomJSONEncoder
JSONDecodeError
MutabilityError
RequiredValueError google-ml_collections-ad8f348/docs/config_dict_examples.rst 0000664 0000000 0000000 00000001614 15000135257 0024245 0 ustar 00root root 0000000 0000000 ml_collections.config_dict examples
===================================
.. toctree::
:maxdepth: 1
ConfigDict Basic
ConfigDict Advanced
ConfigDict Lock
ConfigDict Placeholder
Field Reference
FrozenConfigDict